Content reproduction system, content reproduction apparatus, program, content reproduction method, and providing content server

ABSTRACT

A method, apparatus, encoder, and decoder for receiving, transmitting, encoding and decoding content is provided. The method includes receiving a first segment of the content, the first segment having a first format, receiving, from a transmitting apparatus, a second segment of the content, the second segment having a second format, monitoring a network status between the receiving apparatus and the transmitting apparatus, and selecting the first segment or the second segment based on the monitored network status.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. patent application Ser. No. 14/327,348 (filed Jul. 9, 2014), which is a continuation of U.S. patent application Ser. No. 12/899,178 (filed Oct. 6, 2010, issued as U.S. Pat. No. 8,812,735 on Aug. 19, 2014), which claims priority of Japanese Patent Application No. 2009-238130 (filed Oct. 15, 2009), the entire contents of which are all hereby incorporated by reference.

BACKGROUND

1. Technical Field

The present disclosure relates to a content reproduction system, a content reproduction apparatus, a program, a content reproduction method, and providing a content server.

2. Description of the Related Art

Nowadays, HTTP (HyperText Transfer Protocol) for content transmission and MP4 relating to content compression/encoding are widely used. According to HTTP, not only downloading of content, but also streaming thereof can be performed on the Internet. The HTTP streaming is also adopted by network media standards such as “DLNA guidelines” (2006) and “Open IPTV Forum” (2009). MP4 (ISO/IEC-14496-12, 14) can be used not only as a storage format, but also as a transmission format for downloading, streaming or the like.

For example, “IIS Smooth Streaming Technical Overview,” Alex Zambelli, Microsoft Corporation, March 2009 describes how to perform streaming of content via the Internet by using HTTP and MP4. More specifically, “IIS Smooth Streaming Technical Overview,” Alex Zambelli, Microsoft Corporation, March 2009 describes that a server stores encoded files in the MP4 format encoded at different bit rates and successively sends segments constituting encoded files appropriate for network conditions.

However, in a system of the related art, the server side determines the encoded file from which a segment is to be transmitted and thus, there is an issue that loads on the server side increase. Moreover, information such as a time during which a segment is reproduced (a relative time from the start of content) is not provided to the client, which makes it difficult to perform a trick play such as variable-speed reproduction or to perform reproduction by jumping to the relative time (seek reproduction).

Accordingly, there is disclosed a method for transmitting content. The method may include encoding the content in first and second formats; storing the encoded content in first and second files; receiving a request for a formatted segment, the formatted segment comprising a portion of the encoded data in the second file, and the request including position information identifying a location of the formatted segment; and transmitting the formatted segment.

In accordance with an embodiment, there is provided an apparatus for transmitting content. The apparatus may include an encoder configured to encode the content in first and second formats; a storage unit configured to store the encoded content in first and second files; a receiver configured to receive a request for a formatted segment, the formatted segment comprising a portion of the encoded data in the second file, and the request including position information identifying a location of the formatted segment; and a transmitter configured to transmit the formatted segment.

In accordance with an embodiment, there is provided a method for receiving content in a receiving apparatus. The method may include receiving a first segment of the content, the first segment having a first format; receiving, from a transmitting apparatus, a second segment of the content, the second segment having a second format; monitoring a network status between the receiving apparatus and the transmitting apparatus; and selecting the first segment or the second segment based on the monitored network status.

In accordance with an embodiment, there is provided a method for encoding content. The method may include encoding the content to generate content in a first format; encoding the content to generate content in a second format; processing portion information identifying a portion of the content in the second format; and adding the portion information to the content in the first format.

In accordance with an embodiment, there is provided a method for decoding content. The method may include receiving encoded data, the encoded data including a first section comprising description information and a second section comprising a first-format segment containing content encoded in the first format, the description information including position information; decoding the first-format segment of encoded content; and generating a request for a second-format segment of the encoded content, the second-format segment corresponding to the first-format segment and the request including at least a portion of the position information.

In accordance with an embodiment, there is provided an apparatus for receiving content in a receiving apparatus. The apparatus may include a receiving unit configured to receive, from a transmitting apparatus, a first segment in a first format and a second segment in a second format, the first segment and the second segment including a portion of the content; a monitoring unit configured to monitor a network status between the receiving apparatus and the transmitting apparatus; and a selecting unit configured to select the first segment or the second segment based on the monitored network status.

In accordance with an embodiment, there is provided an apparatus for encoding content. The apparatus may include an encoder configured to encode the content to generate content in a first format and content in a second format; a processing unit configured to process portion information identifying a portion of the content in the second format; and an adding unit configured to add the portion information to the content in the first format.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an explanatory view showing the configuration of a content reproduction system according to an embodiment of the present invention;

FIG. 2 is an explanatory view showing the flow of data in the content reproduction system according to the present embodiment;

FIG. 3 is a block diagram showing the hardware configuration of a content reproduction apparatus;

FIG. 4 is a function block diagram showing the configuration of a content server according to the present embodiment;

FIG. 5 is an explanatory view showing the configuration of a general MP4 file;

FIG. 6 is an explanatory view showing the configuration of an MP4 file generated by a file generation unit in the present embodiment;

FIG. 7 is an explanatory view showing a modification of the MP4 file generated by the file generation unit in the present embodiment;

FIG. 8 is a function block diagram showing the configuration of a content reproduction apparatus according to the present embodiment;

FIG. 9 is a sequence diagram showing an operation of the content reproduction system according to the present embodiment;

FIG. 10 is an explanatory view showing a modification of the MP4 file generated by the file generation unit in the present embodiment;

FIG. 11 is an explanatory view showing a modification of the MP4 file generated by the file generation unit in the present embodiment; and

FIG. 12 is an explanatory view showing a modification of the MP4 file generated by the file generation unit in the present embodiment.

DETAILED DESCRIPTION OF THE EMBODIMENT(S)

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the appended drawings. Note that, in this specification and the appended drawings, structural elements that have substantially the same function and structure are denoted with the same reference numerals, and repeated explanation of these structural elements is omitted.

“DETAILED DESCRIPTION OF THE EMBODIMENT” will be described according to the order shown below:

1. Overview of Content Reproduction System

2. Hardware Configuration of Content Reproduction Apparatus

3. Function of Content Server

4. Function of Content Reproduction Apparatus

5. Operation of Content Reproduction System

6. Modifications

7. Conclusion

1. OVERVIEW OF CONTENT REPRODUCTION SYSTEM

First, a content reproduction system 1 according to an embodiment of the present invention will schematically be described with reference to FIGS. 1 and 2.

FIG. 1 is an explanatory view showing the configuration of a content reproduction system according to an embodiment of the present invention. As shown in FIG. 1, the content reproduction system 1 according to an embodiment of the present invention includes a content server 10 (e.g., transmitting apparatus), a network 12, and a content reproduction apparatus 20 (e.g., client and/or receiving apparatus).

The content server 10 and the content reproduction apparatus 20 are connected via the network 12. The network 12 is a wired or wireless transmission path of information transmitted from an apparatus connected to the network 12.

The network 12 may contain, for example, a public network such as the Internet, a telephone network, and a satellite communication network or LAN (Local Area Network) or WAN (Wide Area Network) including Ethernet (registered trademark). The network 12 may also contain a leased line network such as IP-VPN (Internet Protocol-Virtual Private Network).

The content server 10 encodes content data to generate and store a data file containing encoded data (e.g., first-format segments and/or second-format segments) and meta-information (e.g., description information and/or portion information) of the encoded data. When the content server 10 generates a data file in the MP4 format, encoded data corresponds to “mdat” and meta-information corresponds to “moov”.

Content data may be music data of music, lectures, radio programs and the like, video data of movies, TV programs, video programs, photos, documents, pictures, charts and the like, games, software and the like.

The content server 10 according to the present embodiment generates a plurality of data files from the same content at different bit rates (e.g., compression formats). Relevant points will be described more specifically below with reference to FIG. 2.

FIG. 2 is an explanatory view showing the flow of data in the content reproduction system 1 according to the present embodiment. The content server 10 encodes the same content data at different bit rates to generate, for example, as shown in FIG. 2, a file A at 2 Mbps, a file B at 1.5 Mbps, and a file C at 1 Mbps. The file A has a relatively high bit rate, the file B a standard bit rate, and the file C a low bit rate.

Also as shown in FIG. 2, encoded data of each file is divided into a plurality of segments. For example, encoded data of the file A is divided into segments (e.g., first-format segments) “A1”, “A2”, “A3”, . . . , “An”, encoded data of the file B into segments (e.g., second-format segments) “B1”, “B2”, “B3”, . . . , “Bn”, and encoded data of the file C into segments “C1”, “C2”, “C3”, . . . , “Cn”.

Each segment is constituted by samples, that is, one or more pieces of video encoded data and audio encoded data that begin with sync samples of MP4 (for example, IDR pictures for video encoding of AVC/H.264), and can be reproduced alone. If, for example, video data of 30 frames/second is encoded by GOP (Group of Pictures) of a fixed length of 15 frames, each segment may be video and audio encoded data of 2 seconds corresponding to 4 GOPs or video and audio encoded data of 10 seconds corresponding to 20 GOPs.

Reproduction ranges (ranges of time positions from the start of content) by segments whose arrangement order in each file is the same are the same. For example, the reproduction range of the segment “A2”, that of the segment “B2”, and that of the segment “C2” are the same and if each segment is encoded data of two seconds, the reproduction ranges of the segment “A2”, the segment “B2”, and the segment “C2” are all 2 seconds to 4 seconds of content.

After generating the file A to the file C each constituted by the plurality of segments, the content server 10 stores the file A to the file C. Then, as shown in FIG. 2, the content server 10 sequentially sends segments constituting different files to the content reproduction apparatus 20 and the content reproduction apparatus 20 reproduces the received segments as streaming.
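The arithmetic behind the segment lengths and reproduction ranges above can be checked with a short sketch. The function names are hypothetical and are not part of the embodiment; only the numbers (30 frames/second, GOPs of 15 frames, 4 or 20 GOPs per segment) come from the description.

```python
from fractions import Fraction


def segment_duration(frame_rate, gop_frames, gops_per_segment):
    """Duration in seconds of a segment made of whole fixed-length GOPs."""
    return Fraction(gop_frames * gops_per_segment, frame_rate)


def reproduction_range(index, duration):
    """(start, end) in seconds from the start of content for the 1-based
    segment number `index`, assuming every segment has the same duration."""
    start = (index - 1) * duration
    return start, start + duration


short_segment = segment_duration(30, 15, 4)    # 4 GOPs of 15 frames
long_segment = segment_duration(30, 15, 20)    # 20 GOPs of 15 frames
second_range = reproduction_range(2, short_segment)
```

Because the range depends only on the segment number and the common duration, the same result applies to “A2”, “B2”, and “C2” regardless of bit rate.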

A display apparatus is shown in FIG. 1 as an example of the content reproduction apparatus 20, but the content reproduction apparatus 20 is not limited to such an example. For example, the content reproduction apparatus 20 may be an information processing apparatus such as a PC (Personal Computer), home video processing apparatus (such as a DVD recorder and VCR), PDA (Personal Digital Assistant), home game machine, and home electric appliance. Alternatively, the content reproduction apparatus 20 may be an information processing apparatus such as a mobile phone, PHS (Personal Handyphone System), portable music reproducing apparatus, portable video processing apparatus, and portable game machine.

It is desirable that segments in accordance with network conditions (e.g., network status) are transmitted from the content server 10. For example, it is suitable to transmit high-bit-rate segments (for example, segments constituting the file A) if the network has sufficient bandwidth and low-bit-rate segments (for example, segments constituting the file C) if the network does not have sufficient bandwidth.

However, there is an issue that loads on the content server 10 grow if the content server 10 monitors network conditions and selects segments in accordance with network conditions.

Thus, the above background led to the creation of the content reproduction system 1 according to the present embodiment. According to the content reproduction system 1 in the present embodiment, adaptive streaming can be realized while reducing loads on the server side.

Further, according to the content reproduction system 1 in the present embodiment, most standards such as HTTP and MP4 are supported and also compatibility with existing apparatuses can be maintained. The content reproduction apparatus 20 and the content server 10 constituting the content reproduction system 1 according to the present embodiment will be described below in detail.

2. HARDWARE CONFIGURATION OF CONTENT REPRODUCTION APPARATUS

FIG. 3 is a block diagram showing the hardware configuration of the content reproduction apparatus 20. The content reproduction apparatus 20 includes a CPU (Central Processing Unit) 201, a ROM (Read Only Memory) 202, a RAM (Random Access Memory) 203, and a host bus 204. The content reproduction apparatus 20 also includes a bridge 205, an external bus 206, an interface 207, an input device 208, an output device 210, a storage device (HDD) 211, a drive 212, and a communication device 215.

The CPU 201 functions as an arithmetic processing apparatus and a control apparatus to control overall operations of the content reproduction apparatus 20 according to various programs. The CPU 201 may be a microprocessor, a processing unit, an adding unit, and/or a request unit. The ROM 202 stores programs, arithmetic parameters and the like used by the CPU 201. The RAM 203 temporarily stores programs used for execution by the CPU 201 and parameters that appropriately change during execution thereof. These units are mutually connected by the host bus 204 composed of a CPU bus or the like.

The host bus 204 is connected to the external bus 206 such as a PCI (Peripheral Component Interconnect/Interface) bus via the bridge 205. Incidentally, the host bus 204, the bridge 205, and the external bus 206 are not necessarily constituted separately and these functions may be implemented by one bus.

The input device 208 is constituted by an input means used by a user to input information such as a mouse, keyboard, touch panel, button, microphone, switch, and lever and an input control circuit that generates an input signal based on input by the user and outputs the input signal to the CPU 201. The user of the content reproduction apparatus 20 can input various kinds of data into the content reproduction apparatus 20 and issue instructions of a processing operation by operating the input device 208.

The output device 210 contains, for example, a display device such as a CRT (Cathode Ray Tube) display device, liquid crystal display (LCD) device, OLED (Organic Light Emitting Diode) device, and lamp. Further, the output device 210 contains an audio output device such as a speaker and headphone. The output device 210 outputs, for example, reproduced content. More specifically, the display device displays various kinds of information such as reproduced video data as text or images. The audio output device, on the other hand, converts reproduced audio data or the like into sound and outputs the sound.

The storage device 211 is a device for data storage constituted as an example of the storage unit of the content reproduction apparatus 20 according to the present embodiment. The storage device 211 may contain a storage medium, a recording device that records data in the storage medium, a reading device that reads data from the storage medium, or a deletion device that deletes data recorded in the storage medium. The storage device 211 is constituted by, for example, an HDD (Hard Disk Drive). The storage device 211 drives the hard disk and stores programs executed by the CPU 201 and various kinds of data.

The drive 212 is a reader/writer for a storage medium and is attached to the content reproduction apparatus 20 internally or externally. The drive 212 reads information recorded in an inserted removable storage medium 24 such as a magnetic disk, optical disk, magneto-optical disk, and semiconductor memory and outputs the information to the RAM 203. The drive 212 can also write information into the removable storage medium 24.

The communication device 215 is a communication interface constituted by, for example, communication devices for connecting to the network 12. The communication device 215 may be a wireless LAN (Local Area Network) compatible communication device, LTE (Long Term Evolution) compatible communication device, or wired communication device that performs communication by wire.

In the foregoing, the hardware configuration of the content reproduction apparatus 20 has been described with reference to FIG. 3. Hardware of the content server 10 can be constituted substantially in the same manner as that of the content reproduction apparatus 20 and thus, a description thereof is omitted.

3. FUNCTION OF CONTENT SERVER

Next, the function of the content server 10 according to the present embodiment will be described with reference to FIGS. 4 to 7.

FIG. 4 is a function block diagram showing the configuration of the content server 10 according to the present embodiment. As shown in FIG. 4, the content server 10 according to the present embodiment includes a file generation unit 120, a storage unit 130, and a communication unit 140.

The file generation unit 120 includes an encoder 122 that encodes content data to generate an MP4 file containing encoded data and metadata thereof. More specifically, the file generation unit 120 generates a plurality of MP4 files having encoded data at different bit rates from the same content. The configuration of a general MP4 file will be described below with reference to FIG. 5 and then, the configuration of an MP4 file generated by the file generation unit 120 in the present embodiment will be described.

FIG. 5 is an explanatory view showing the configuration of a general MP4 file. As shown in FIG. 5, the MP4 file contains “moov” and “mdat”. “mdat” is encoded data of video and audio. In the present embodiment, H.264/AVC is used for video encoding and HE-AAC for audio encoding. “moov” contains access information (e.g., description information and/or portion information) to each segment contained in “mdat” such as “trak (video)” and “trak (audio)”. The access information includes, for example, location information (byte offset) of each sample and reproduction time information.

“dinf” is defined in MP4 as a data box to refer to other external files. If, as shown in FIG. 5, “moov” refers to “mdat” contained in the same MP4 file, the value of “dinf” is “null”. In the present embodiment, by contrast, as will be described with reference to FIG. 6, a noticeable effect can be achieved by making full use of this “dinf”.

FIG. 6 is an explanatory view showing the configuration of an MP4 file generated by the file generation unit 120 in the present embodiment. As shown in FIG. 6, the file generation unit 120 generates a plurality of MP4 files, the MP4 file A to the MP4 file C, containing “mdat” at different bit rates from the same content.

In the present embodiment, segments are data divided by a boundary of MP4 Sync Sample of video, and video encoded data and audio encoded data are arranged in a segment after being interleaved. Segments are continuously arranged in mdat in the time sequence in which content is reproduced. Video and audio are encoded so as to yield the same reproduction time of segments of each data file at different bit rates. In the case of AVC/H.264, video encoded data and audio encoded data are arranged in such a way that an IDR picture is present at the head of a segment, so that data can be switched to data at a different bit rate in units of segments.

The position of each segment is the position of Sync Sample and the content reproduction apparatus 20 can read segment data from each data file based on the segment position obtained from information of Sample Description box in “moov” or in combination with Sync sample table box contained therein. In the present embodiment, one video frame is set to be one Sample, a Sync Sample, which is a Sample in which an IDR picture is present, is created once in 30 frames, and Sync sample table box is provided in Sample Description box.
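As an illustration of how segment positions could follow from Sync Sample positions, the sketch below derives a byte offset and length per segment from a list of sample sizes and 1-based Sync Sample numbers. It is a simplification under stated assumptions: a single track whose samples are stored contiguously, whereas the embodiment interleaves video and audio; the function name and inputs are hypothetical.

```python
def segment_offsets(sample_sizes, sync_samples, data_offset=0):
    """Return a list of (byte_offset, byte_length), one per segment.

    sample_sizes: size in bytes of each sample, in stored order.
    sync_samples: ascending 1-based sample numbers of the Sync Samples.
    data_offset:  byte offset of the first sample within the file.
    Assumption: samples are stored back to back with no gaps.
    """
    # starts[i] is the byte offset of sample i + 1; the last entry marks
    # the end of the stored data.
    starts = [data_offset]
    for size in sample_sizes:
        starts.append(starts[-1] + size)

    # Each segment runs from one Sync Sample up to the next one.
    bounds = [number - 1 for number in sync_samples] + [len(sample_sizes)]
    return [(starts[a], starts[b] - starts[a])
            for a, b in zip(bounds, bounds[1:])]


# 60 samples of 10 bytes with a Sync Sample once in 30 frames.
offsets = segment_offsets([10] * 60, [1, 31], data_offset=100)
```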

“mdat” of the MP4 file B (first data file) is constituted by segments B1 to Bn whose bit rate is 1.5 Mbps, “mdat” of the MP4 file C (second data file) is constituted by segments C1 to Cn whose bit rate is 1 Mbps, and “mdat” of the MP4 file A (third data file) is constituted by segments A1 to An whose bit rate is 2 Mbps.

“moov” of the MP4 file B contains “trak (videoB)” and “trak (audioB)” to access the segments B1 to Bn constituting the same file.

Further, “moov” of the MP4 file B contains “trak (videoC′)” and “trak (audioC′)” to access the segments C1 to Cn constituting the MP4 file C.

That is, the URL of the MP4 file C is described in “dinf” of “trak (videoC′)” and “trak (audioC′)”. More specifically, the URL of the MP4 file C is described in the ‘location’ field in the syntax of “dinf” shown below. Moreover, position information (byte offset in a file) of each Sample and Sync Sample of the segments C1 to Cn is obtained from information of Sample Description Box of a video track described in “trak (videoC′)” and “trak (audioC′)”.

(Syntax example)

    aligned(8) class DataEntryUrlBox (bit(24) flags)
        extends FullBox('url ', version = 0, flags) {
        string location;
    }

Similarly, “moov” of the MP4 file B contains “trak (videoA′)” and “trak (audioA′)” to access the segments A1 to An constituting the MP4 file A. That is, the URL of the MP4 file A is described in “dinf” of “trak (videoA′)” and “trak (audioA′)”.

While the MP4 file A also contains “trak (videoA)” and “trak (audioA)” to access the segments A1 to An constituting the MP4 file A, the content reproduction apparatus 20 does not use these for adaptive streaming described later.

Similarly, while the MP4 file C also contains “trak (videoC)” and “trak (audioC)” to access the segments C1 to Cn constituting the MP4 file C, the content reproduction apparatus 20 does not use these for adaptive streaming described later.

In the present embodiment, as described above, “mdat” having different bit rates are created in different MP4 files rather than the same MP4 file. Moreover, the URL and offset information of each segment in a file to refer to “mdat” contained in other MP4 files are described in Sample Description box of one MP4 file.

With such a configuration, an MP4 file according to the present embodiment can be used not only for streaming, but also for downloading. The reason therefor will be described by comparing with a case where a plurality of “mdat” having different bit rates is generated in the same file.

If the plurality of “mdat” having different bit rates is generated in the same file and the file is also used for downloading, the client will download the whole file containing the plurality of “mdat”. Thus, an issue arises that the amount of download data and the download time will increase unnecessarily.

In the present embodiment, by contrast, an MP4 file containing only one “mdat” among the plurality of “mdat” with different bit rates can be downloaded. For example, the content reproduction apparatus 20 can download, among the plurality of “mdat” with different bit rates, the MP4 file A containing only “mdat” at a high bit rate. Therefore, the client can download while curbing the amount of download data and the download time.
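A minimal sketch of the resulting lookup is given below: one set of track entries, as they might be read from “moov” of the MP4 file B, maps each bit rate to a file URL and per-segment byte positions. The class, the dictionary keys, the example.com URLs, and all offsets are hypothetical placeholders and do not come from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackRef:
    """Access information for one alternative of the content."""
    location: str     # URL taken from "dinf"; "" means the same file
    segments: tuple   # (byte_offset, byte_length) for segments 1..n


def locate(tracks, name, index):
    """Return (url, byte_offset, byte_length) of the 1-based segment
    `index` in the alternative called `name`."""
    track = tracks[name]
    offset, length = track.segments[index - 1]
    return track.location, offset, length


# Hypothetical contents of "moov" in the MP4 file B.
tracks = {
    "B": TrackRef("", ((4096, 375000), (379096, 375000))),
    "C": TrackRef("http://example.com/fileC.mp4",
                  ((4096, 250000), (254096, 250000))),
    "A": TrackRef("http://example.com/fileA.mp4",
                  ((4096, 500000), (504096, 500000))),
}

second_low_segment = locate(tracks, "C", 2)
```

With this arrangement a client needs the metadata of only one file to address segments in all of the files.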

The file generation unit 120 may write information indicating whether media data referred to by each “trak” belongs to a group of alternative media data obtained by encoding at different bit rates into “minfo” of each track in “moov” of the file B. For example, the following extended block may be provided in the syntax of “minfo” shown below to write the identification number of a group of alternative media data into “alternative_media_group”, “<uuid_value>: T. B. D” into “extended_type”, and “0” into “flags”. The content reproduction apparatus 20 can recognize that segments of media data belonging to a group of alternative media data can be replaced by compatible segments in other media data belonging to the same group. The maximum bit rate maxbitrate and the average bit rate avgbitrate of media are also described, which can be used by the content reproduction apparatus 20 to determine the encoded data from which segments are to be acquired.

(Syntax example)

    aligned(8) class AlternateMediaInformationBox
        extends FullBox('uuid', version = 0, flags = 0, extended_type) {
        unsigned int(32) alternative_media_group;
        unsigned int(32) maxbitrate;
        unsigned int(32) avgbitrate;
    }
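One way such a box could be laid out in bytes is sketched below. The layout is an assumption, not something the description specifies: a 32-bit size, the type 'uuid', a 16-byte extended type, a version byte and 24-bit flags, then the three 32-bit big-endian fields. Since the UUID value is still to be determined, an all-zero UUID is used purely as a placeholder.

```python
import struct
import uuid

# Placeholder only: the description leaves the actual UUID value T.B.D.
PLACEHOLDER_TYPE = uuid.UUID(int=0)


def pack_alternate_media_info(group, maxbitrate, avgbitrate,
                              extended_type=PLACEHOLDER_TYPE):
    """Serialize the box under the layout assumed above."""
    version_and_flags = struct.pack(">B3s", 0, b"\x00\x00\x00")
    fields = struct.pack(">III", group, maxbitrate, avgbitrate)
    body = b"uuid" + extended_type.bytes + version_and_flags + fields
    return struct.pack(">I", 4 + len(body)) + body


def unpack_alternate_media_info(data):
    """Return (group, maxbitrate, avgbitrate) from a box packed above."""
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"uuid" or size != len(data):
        raise ValueError("not a complete 'uuid' box")
    # Bytes 8..23 hold the extended type, bytes 24..27 version and flags.
    return struct.unpack(">III", data[28:40])


box = pack_alternate_media_info(1, 2_000_000, 1_800_000)
```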

With such a configuration, the content reproduction apparatus 20 can determine whether an MP4 file is generated according to a method in the present embodiment by checking “minfo” in “moov” of the MP4 file. Then, if the MP4 file is a file generated according to a method in the present embodiment, the content reproduction apparatus 20 can request, as described later, adaptive streaming from the content server 10.

An example in which an MP4 file is mainly constituted by “moov” and “mdat” is shown in FIG. 6, but the configuration of an MP4 file is not limited to such an example. For example, access information contained in “moov” shown in FIG. 6 may be arranged, as shown in FIG. 7, in a distributed manner by using “moov” and “moof”.

FIG. 7 is an explanatory view showing a modification of the MP4 file generated by the file generation unit 120 in the present embodiment. As shown in FIG. 7, “moov” is arranged at the head of each file and then, “mdat” and “moof” are arranged alternately. Like the structure of an MP4 file described above, “moov” of the MP4 file B contains “trak” in which access information to each segment of the MP4 files B, A, and C is described and Sample Description box to access subsequent “mdat”. Each “moof” of the MP4 file B contains a plurality of “traf” corresponding to “trak” described in “moov” and “traf” contains information to access each segment of “mdat” subsequent to each file. The MP4 files C and A may also have “moov” and “moof” described therein, but like the above example, the content reproduction apparatus 20 does not use these for adaptive streaming.

By arranging access information in a distributed manner, the amounts of data of “moov” at the head of the MP4 file B and each “moof” can be made smaller, so that the acquisition time of “moov” at the head can be curbed and information of “moov” and “moof” held by the content reproduction apparatus 20 in a buffer 230 can be reduced. Moreover, “moof” and corresponding mdat can be generated independently and thus can be used for streaming of live content such as live broadcasting. The present embodiment is also applicable to the format shown in FIG. 7 in which “moov”, “moof”, and “mdat” are arranged in a distributed manner.

Returning to the description of the configuration of the content server 10 with reference to FIG. 4, the storage unit 130 of the content server 10 shown in FIG. 4 is a storage medium that stores a plurality of MP4 files generated by the file generation unit 120.

For example, the storage unit 130 may be a storage medium such as a nonvolatile memory, magnetic disk, optical disk, and MO (Magneto Optical) disk. The nonvolatile memory includes, for example, an EEPROM (Electrically Erasable Programmable Read-Only Memory) and EPROM (Erasable Programmable ROM). The magnetic disk includes a hard disk and disc-like magnetic disk. The optical disk includes a CD (Compact Disc), DVD-R (Digital Versatile Disc Recordable), and BD (Blu-ray Disc (registered trademark)).

The communication unit 140 is an interface with the content reproduction apparatus 20 and communicates with the content reproduction apparatus 20 via the network 12. More specifically, the communication unit 140 has a function as an HTTP server that communicates with the content reproduction apparatus 20 according to HTTP. For example, the communication unit 140 extracts data requested from the content reproduction apparatus 20 according to HTTP from the storage unit 130 and transmits the data to the content reproduction apparatus 20 as an HTTP response.

4. FUNCTION OF CONTENT REPRODUCTION APPARATUS

In the foregoing, the function of the content server 10 according to the present embodiment has been described. Next, the function of the content reproduction apparatus 20 according to the present embodiment will be described with reference to FIG. 8.

FIG. 8 is a function block diagram showing the configuration of the content reproduction apparatus 20 according to the present embodiment. As shown in FIG. 8, the content reproduction apparatus 20 according to the present embodiment includes an acquisition unit 220, the buffer 230, a reproduction unit 240, and a selection unit 250.

The acquisition unit 220 is an interface with the content server 10 and requests data from the content server 10 to acquire the data from the content server 10. More specifically, the acquisition unit 220 has a function as an HTTP client that communicates with the content server 10 according to HTTP. For example, the acquisition unit 220 can partially acquire a portion (moov or a segment) of an MP4 file from the content server 10 by using HTTP Range.
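A partial acquisition with HTTP Range can be sketched as follows. The helper name and the example.com URL are hypothetical; the sketch only builds the request object and does not send it. HTTP byte ranges are inclusive at both ends, so the last byte is offset + length - 1.

```python
import urllib.request


def range_header(offset, length):
    """Build a Range header that asks for `length` bytes from `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


# Hypothetical URL and byte positions; nothing is sent over the network.
request = urllib.request.Request(
    "http://example.com/fileB.mp4",
    headers=range_header(4096, 375000),
)
# Passing `request` to urllib.request.urlopen would perform the transfer;
# a server that honors the range answers with status 206 (Partial Content).
```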

The buffer 230 sequentially buffers segments acquired by the acquisitionunit 220 from the content server 10. Segments buffered in the buffer 230are sequentially supplied to the reproduction unit 240 according to FIFO(First In First Out).
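The FIFO behavior can be illustrated with a small queue. The class and method names are hypothetical and the segments are represented only by their labels.

```python
from collections import deque


class SegmentBuffer:
    """First-in, first-out store for acquired segments."""

    def __init__(self):
        self._queue = deque()

    def push(self, segment):
        """Store a newly acquired segment behind the ones already held."""
        self._queue.append(segment)

    def pop(self):
        """Hand the oldest buffered segment to the reproduction side."""
        return self._queue.popleft()

    def __len__(self):
        return len(self._queue)


segment_buffer = SegmentBuffer()
for label in ("A1", "B2", "A3"):
    segment_buffer.push(label)
first_out = segment_buffer.pop()
```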

The reproduction unit 240 sequentially reproduces segments supplied fromthe buffer 230. More specifically, the reproduction unit 240 performssegment decoding, DA conversion, and rendering.

The selection unit 250 sequentially selects from within the same contentan MP4 file a segment of which is to be acquired, that is, a segmenthaving a bit rate to be acquired in accordance with conditions of thenetwork 12. If, for example, the selection unit 250 successively selectssegments “A1”, “B2”, and “A3”, as shown in FIG. 2, the acquisition unit220 successively acquires the segments “A1”, “B2”, and “A3” from thecontent server 10.

The acquisition unit 220 acquires “moov” of an MP4 file prior to the acquisition of segments, and can then acquire a segment selected by the selection unit 250 from the content server 10 by specifying access information contained in the “moov”.

If the band of the network 12 grows, the amount of data buffered in the buffer 230 is expected to increase, and if the band of the network 12 shrinks, the amount of data buffered in the buffer 230 is expected to decrease. Thus, the selection unit 250 may indirectly grasp conditions of the network 12 by monitoring buffering conditions of the buffer 230.

If, for example, the number of samples (the number of video frames) buffered in the buffer 230 is within a predetermined range, that is, if the reproducible time by samples buffered in the buffer 230 is within a predetermined range, the selection unit 250 may select segments at the standard bit rate (for example, 1.5 Mbps). For example, the content reproduction apparatus 20 starts reproduction of streaming after temporarily accumulating 90 samples at the standard bit rate (corresponding to three seconds) and continues the reproduction while reading subsequent segment data. If data in the buffer 230 during reproduction is in the range of 75 to 105 samples, the selection unit 250 selects segments at the standard bit rate.

If, on the other hand, the buffering amount decreases and the reproducible time by samples buffered in the buffer 230 falls below the predetermined range, the selection unit 250 may select segments at a low bit rate (for example, 1 Mbps). If, for example, data in the buffer 230 during reproduction falls to 75 samples or less, the selection unit 250 selects segments at a low bit rate.

If the buffering amount increases and the reproducible time by samples buffered in the buffer 230 exceeds the predetermined range, the selection unit 250 may select segments at a high bit rate (for example, 2 Mbps). If, for example, data in the buffer 230 during reproduction increases to 105 samples or more, the selection unit 250 selects segments at a high bit rate. Further, if the number of samples in the buffer 230 reaches 120 so that samples are sufficiently accumulated, the selection unit 250 temporarily stops reading, and when the number thereof falls below 120, the selection unit 250 restarts reading.
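The buffer-based selection described above can be sketched as follows. This is a minimal illustration using the example figures from the text (75, 105, and 120 samples; 1, 1.5, and 2 Mbps). The function and constant names are hypothetical, and assigning the boundary values 75 and 105 to the low and high bit rates is an assumption based on the "or less" and "or more" wording.

```python
LOW_BPS = 1_000_000       # example low bit rate (1 Mbps)
STANDARD_BPS = 1_500_000  # example standard bit rate (1.5 Mbps)
HIGH_BPS = 2_000_000      # example high bit rate (2 Mbps)

LOW_WATERMARK = 75    # at or below this many buffered samples: low bit rate
HIGH_WATERMARK = 105  # at or above this many buffered samples: high bit rate
PAUSE_LEVEL = 120     # at or above this many buffered samples: stop reading


def select_bit_rate(buffered_samples: int) -> int:
    """Return the bit rate (bps) to request next, given the buffered sample count."""
    if buffered_samples <= LOW_WATERMARK:
        return LOW_BPS
    if buffered_samples >= HIGH_WATERMARK:
        return HIGH_BPS
    return STANDARD_BPS


def should_pause_reading(buffered_samples: int) -> bool:
    """Return True while enough samples are buffered that reading can be suspended."""
    return buffered_samples >= PAUSE_LEVEL


for level in (60, 90, 110, 120):
    print(level, select_bit_rate(level), should_pause_reading(level))
```

Keeping the thresholds as named constants makes it easy to tune the range for a different frame rate or start-up buffer size.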

In the foregoing, monitoring of buffering conditions of the buffer 230 has been described as an example of the method for determining the band of the network 12, but the present embodiment is not limited to such an example. For example, the content reproduction apparatus 20 may determine the band of the network 12 by actually transmitting a dummy packet to the network 12 or may determine the band of the network 12 based on the acquisition speed of segments by the acquisition unit 220.
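As a rough illustration of determining the band from the acquisition speed of segments, the following sketch divides the number of acquired bits by the elapsed time. The function name and the sample figures are assumptions; the embodiment does not specify how the acquisition speed is computed.

```python
def estimate_bandwidth_bps(segment_bytes: int, elapsed_seconds: float) -> float:
    """Estimate throughput in bits per second from one completed segment transfer."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return segment_bytes * 8 / elapsed_seconds


# Hypothetical measurement: 562,500 bytes received in 3 seconds.
print(estimate_bandwidth_bps(562_500, 3.0))
```

In practice a single measurement is noisy, so an implementation would likely average several recent segments before switching bit rates.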

5. OPERATION OF CONTENT REPRODUCTION SYSTEM

In the foregoing, the functions of the content server 10 and the content reproduction apparatus 20 according to the present embodiment have been described. Next, the operation of the content reproduction system 1 according to the present embodiment will be described with reference to FIG. 9.

FIG. 9 is a sequence diagram showing the operation of the content reproduction system 1 according to the present embodiment. First, the acquisition unit 220 of the content reproduction apparatus 20 requests, from the content server 10, the transmission of “moov” of the MP4 file B concerning some content through “HTTP: GET URL-B with Range” (S304). Then, the communication unit 140 of the content server 10 transmits “moov” of the MP4 file B to the content reproduction apparatus 20 as “HTTP: Response” (S308). It is assumed that URL-B of the MP4 file B is described in metadata information of the content and that the content reproduction apparatus 20 has already acquired that metadata information. Then, the buffer 230 of the content reproduction apparatus 20 starts buffering of “moov” of the MP4 file B acquired from the content server 10 (S310).

Here, the selection unit 250 of the content reproduction apparatus 20 can determine whether a referred file of “trak” in “moov” belongs to an alternative media group obtained by encoding at different bit rates by checking “minfo” in “moov”.

Then, if the referred file of “trak” in “moov” belongs to an alternative media group obtained by encoding at different bit rates, the selection unit 250 selects a segment Bi of the MP4 file B having the standard bit rate.

Next, the acquisition unit 220 requests the segment Bi of the MP4 file B selected by the selection unit 250 from the content server 10 by using “HTTP: GET URL-B with Range” (S312). More specifically, the acquisition unit 220 requests the segment Bi of the MP4 file B from the content server 10 by specifying network position information of the MP4 file B and position information of the segment Bi in the MP4 file B in bytes. The network position information of the MP4 file B and the position information of the segment Bi in the MP4 file B in bytes are described in “moov” of the MP4 file B received in step S308. Then, the communication unit 140 of the content server 10 transmits the segment Bi of the MP4 file B to the content reproduction apparatus 20 as “HTTP: Response” (S316).

Then, when the segment Bi is sufficiently buffered in the buffer 230 of the content reproduction apparatus 20, the reproduction unit 240 starts reproduction of the segment Bi (S320). If data cannot be read from the buffer sufficiently even after a certain time has passed since the start of buffering (S310), the network band can be considered to be insufficient. In such a case, subsequent segment reading may be switched to segments in the file C from S316 onward. Similarly, if predetermined segments are determined to be bufferable earlier than expected, it is also possible to start reproduction (S320) after segments of the file A have been buffered.

Similarly, the acquisition unit 220 of the content reproduction apparatus 20 requests the next segment Bj from the content server 10 by using “HTTP: GET URL-B with Range” (S324). Then, the communication unit 140 of the content server 10 transmits the next segment Bj to the content reproduction apparatus 20 as “HTTP: Response” (S328).

If the buffering amount of the buffer 230 decreases and the reproducible time by samples buffered in the buffer 230 falls below a predetermined range (S332), the selection unit 250 selects a segment Ck of the MP4 file C having a low bit rate.

Then, the acquisition unit 220 requests the segment Ck of the MP4 file C selected by the selection unit 250 from the content server 10 by using “HTTP: GET URL-C with Range” (S336). The communication unit 140 of the content server 10 that has received the request transmits the segment Ck of the MP4 file C to the content reproduction apparatus 20 as “HTTP: Response” (S340).

Then, if the buffering amount of the buffer 230 increases and the reproducible time by samples buffered in the buffer 230 returns to within the predetermined range (S344), the selection unit 250 selects a segment Bl of the MP4 file B having the standard bit rate.

Next, the acquisition unit 220 requests the segment Bl of the MP4 file B selected by the selection unit 250 from the content server 10 by using “HTTP: GET URL-B with Range” (S348). Then, the communication unit 140 of the content server 10 transmits the segment Bl of the MP4 file B to the content reproduction apparatus 20 as “HTTP: Response” (S352).

If the buffering amount of the buffer 230 increases still further thereafter and the reproducible time by samples buffered in the buffer 230 exceeds the predetermined range (S356), the selection unit 250 selects a segment Am of the MP4 file A having a high bit rate.

Next, the acquisition unit 220 requests the segment Am of the MP4 file A selected by the selection unit 250 from the content server 10 by using “HTTP: GET URL-A with Range” (S360). Then, the communication unit 140 of the content server 10 transmits the segment Am of the MP4 file A to the content reproduction apparatus 20 as “HTTP: Response” (S364).

Hereinafter, the selection unit 250 similarly selects a segment having a bit rate to be requested in accordance with the buffering amount of the buffer 230, and the acquisition unit 220 acquires the segment selected by the selection unit 250 from the content server 10.

With such a configuration, reproduction can be prevented from being broken off when the band of the network 12 is small, and high-quality reproduction can be realized when the band of the network 12 is large. Moreover, in the present embodiment, loads on the content server 10 can be reduced because the band of the network 12 is determined and the segment to be requested is selected on the content reproduction apparatus 20 side.

6. MODIFICATIONS

An example that enables access to “mdat” of another file by using “dinf” in “trak” is described above, but as described below with reference to FIG. 10, reference to “trak” of another file may be enabled by using “trak”.

FIG. 10 is an explanatory view showing a modification of the MP4 file generated by the file generation unit 120 in the present embodiment. If, as shown in FIG. 10, access information to “trak” of the MP4 file A is written into “trak” of the MP4 file B, the content reproduction apparatus 20 can acquire “trak” of the MP4 file A by analyzing “trak” of the MP4 file B and using the described access information. Thus, the content reproduction apparatus 20 can acquire the segments A1, A2, . . . based on “trak” of the MP4 file A and the Sample Description box described therein.

Similarly, if access information to “trak” of the MP4 file C is written into “trak” of the MP4 file B, the content reproduction apparatus 20 can acquire “trak” of the MP4 file C by analyzing “trak” of the MP4 file B and using the described access information. Thus, the content reproduction apparatus 20 can also acquire the segments C1, C2, . . . based on “trak” of the MP4 file C and the Sample Description box described therein.

More specifically, the MP4 file format may be extended to write the extended box shown below into “minfo”, with “<uuid_value>: T. B. D” written into “extended_type” in the syntax, the URL of the referred MP4 file into “location”, and the identifier of “trak” in the referred MP4 file into “track_ID”. Accordingly, the content reproduction apparatus 20 can recognize that alternative media data for the media data on a track of the file B is located on the track of the file C indicated by track_ID. Moreover, bit rate information such as the maximum bit rate maxbitrate and the average bit rate avgbitrate of the media is also described, which can be used by the content reproduction apparatus 20 to determine which encoded data segments are to be acquired.

(Syntax example)
aligned(8) class AlternateMediaReferenceBox extends FullBox(‘uuid’, version=0, flags=0, extended_type) {
   unsigned int(32) entry_count;
   for (i=1; i <= entry_count; i++) {
      string location;  // URL
      unsigned int(32) track_ID;
      unsigned int(32) maxbitrate;
      unsigned int(32) avgbitrate;
   }
}
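To make the field layout of the syntax example concrete, the following sketch parses the entries of such a box from a byte string. It is only an illustration under stated assumptions: the payload is taken to start at entry_count (that is, after the box header, extended type, version, and flags), integers are big-endian, and “location” is a null-terminated UTF-8 string. The class name, function name, and sample values are hypothetical.

```python
import struct
from typing import List, NamedTuple


class AlternateMediaEntry(NamedTuple):
    location: str     # URL of the referred MP4 file
    track_id: int     # identifier of "trak" in the referred file
    max_bitrate: int  # maximum bit rate
    avg_bitrate: int  # average bit rate


def parse_alternate_media_entries(payload: bytes) -> List[AlternateMediaEntry]:
    """Parse entry_count and the entries that follow it (assumed layout, see lead-in)."""
    (entry_count,) = struct.unpack_from(">I", payload, 0)
    pos = 4
    entries = []
    for _ in range(entry_count):
        end = payload.index(b"\x00", pos)  # assumed null terminator of "location"
        location = payload[pos:end].decode("utf-8")
        pos = end + 1
        track_id, max_bitrate, avg_bitrate = struct.unpack_from(">III", payload, pos)
        pos += 12
        entries.append(AlternateMediaEntry(location, track_id, max_bitrate, avg_bitrate))
    return entries


# Hypothetical payload with a single entry referring to a low bit rate file.
payload = (
    struct.pack(">I", 1)
    + b"http://example.com/fileC.mp4\x00"
    + struct.pack(">III", 1, 1_200_000, 1_000_000)
)
entries = parse_alternate_media_entries(payload)
print(entries)
```

A reproduction apparatus could compare the avg_bitrate values of the parsed entries with its current band estimate to decide which referred file to request from.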

The above configuration is similarly applicable to a file format in which access information contained in “moov” is arranged in a distributed manner by using “moov” and “moof”. In this case, as shown in FIG. 11, “trak” and “traf” of another file can be accessed using “trak” of the MP4 file B by writing access information to “trak” of the other file into “trak”.

FIG. 11 is an explanatory view showing a modification of the MP4 file generated by the file generation unit 120 in the present embodiment. As shown in FIG. 11, if access information to “trak” of the MP4 file A is written into “trak” of the MP4 file B, the content reproduction apparatus 20 can acquire “trak” of the MP4 file A by analyzing “trak” of the MP4 file B and using the described access information. Thus, the content reproduction apparatus 20 can also acquire segments A11, A12, . . . based on “trak” of the MP4 file A.

Similarly, if access information to “trak” of the MP4 file C is written into “trak” of the MP4 file B, the content reproduction apparatus 20 can acquire “trak” of the MP4 file C by analyzing “trak” of the MP4 file B and using the described access information. Thus, the content reproduction apparatus 20 can also acquire segments C11, C12, . . . based on “trak” of the MP4 file C and each “traf”. The position of “moof” in each file can be acquired by the content reproduction apparatus 20 analyzing the box structure of the MP4 file. Alternatively, position information of each “moof” may be acquired by using the Movie Fragment Random Access box described in the MP4 file so that, after the relevant “moof” information is acquired, each segment of “mdat” subsequent to the “moof” is accessed. Moreover, “mdat” immediately after “moof” can be read without time delay by reading the “moof” information in advance and analyzing “traf”.
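Locating “moof” by analyzing the box structure can be sketched as a walk over the top-level boxes, each of which begins with a 32-bit size and a four-character type. The following is a minimal illustration on in-memory bytes; the function names and the synthetic sample data are assumptions, and a real apparatus would more likely read only box headers through ranged requests.

```python
import struct
from typing import Iterator, Tuple


def iter_top_level_boxes(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each top-level box in `data`."""
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if pos + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the data.
            size = len(data) - pos
        if size < header:
            raise ValueError("malformed box size")
        yield box_type.decode("latin-1"), pos, size
        pos += size


def make_box(box_type: bytes, body: bytes = b"") -> bytes:
    """Build a box with a 32-bit size header (helper for the synthetic sample)."""
    return struct.pack(">I4s", 8 + len(body), box_type) + body


# Synthetic sample: moov, then two moof/mdat pairs with dummy bodies.
sample = (
    make_box(b"moov", b"\x00" * 4)
    + make_box(b"moof", b"\x00" * 8)
    + make_box(b"mdat", b"\x00" * 16)
    + make_box(b"moof")
    + make_box(b"mdat", b"\x00" * 2)
)
moof_offsets = [offset for kind, offset, _ in iter_top_level_boxes(sample) if kind == "moof"]
print(moof_offsets)
```

With the offset and size of each “moof” known, the byte range of the “mdat” that follows it can be requested directly.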

7. CONCLUSION

In the present embodiment, as described above, the selection unit 250 of the content reproduction apparatus 20 selects segments having the bit rate to be requested in accordance with the band of the network 12 and the acquisition unit 220 acquires the selected segment from the content server 10. Therefore, according to the present embodiment, loads on the content server 10 can be reduced.

The present embodiment mostly conforms to existing standards such as HTTP and MP4. Therefore, the present embodiment is compatible with streaming using existing HTTP and MP4 and can minimize extensions so that smooth introduction thereof can be expected.

Moreover, in the present embodiment, “mdat” having different bit rates are created in different MP4 files rather than in the same MP4 file. Thus, each MP4 file can be used not only for streaming, but also for downloading without hindrance.

It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.

For example, each step of processing of the content reproduction system 1 herein is not necessarily executed chronologically in the order described as a sequence diagram. For example, each step of processing of the content reproduction system 1 may be executed in an order different from the order described as a sequence diagram or in parallel.

A computer program to cause hardware such as the CPU 201, the ROM 202, and the RAM 203 contained in the content reproduction apparatus 20 and the content server 10 to perform the function equivalent to that of each component of the content reproduction apparatus 20 and the content server 10 described above can be created. Moreover, a storage medium in which the computer program is stored is also provided.

In the present embodiment, as shown in FIGS. 6, 7, 10, and 11, encoded data at the standard bit rate is arranged in the first data file, but encoded data at a low bit rate or a high bit rate may also be arranged.

In the present embodiment, as shown in FIGS. 6, 7, 10, and 11, encoded data is arranged in the first data file, but only access information to such encoded data may be arranged in moof of the first data file.

In the present embodiment, as shown in FIG. 7, an example in which “moov”, “moof”, and “mdat” are arranged in a distributed manner is shown, but distributed arrangement may be limited to the first data file so that, as shown in FIG. 6, other data files are constituted by “moov” and “mdat” corresponding thereto.

Further, FIG. 12 shows an embodiment when the first data file does not contain encoded data. The first data file has access information to each segment arranged in other data files described therein. Access information is arranged in the first data file in a distributed manner by using “moov” and “moof” and each “moof” has only access information to segments of only one data file described therein.

In this case, “traf” of each of a video track and an audio track has access information to each segment described in each “moof” and access information to segments in a range of sets of “moof” arranged consecutively (three sets in this case) described therein.

In the example shown in FIG. 12, each “trak” of “moov” does not contain access information to segments and the next three “moof” have access information from segment 1 to segment (i−1) described therein. Similarly, the next three “moof” have access information from segment i to segment (j−1) described therein and further, the next three “moof” have access information from segment j to segment (k−1) described therein. The arrangement order of “trak” in “moov” (that is, B, C, A) and the arrangement order of “traf” in three “moof” (that is, B, C, A) match, which makes reading of “traf” easier.

By configuring the first data file in this manner, access information to segments can easily be obtained only by analyzing the first data file. Moreover, segment information of each data file is divided in units of “moof” and thus, the content reproduction apparatus 20 can perform adaptive streaming while selecting a data file of the appropriate bit rate matching network conditions by acquiring and holding only “moof” of a necessary data file without holding access information to segments of all data files.

Data files that contain encoded data are not distributed by “moof” and are constituted by “moov” and “mdat”, and thus such data files can be used for a content reproduction apparatus that only supports streaming using existing HTTP and MP4.

In consideration of issues such as an existing content reproduction apparatus being unable to reproduce the content because the first data file does not contain encoded data, a mechanism may be provided so that the first MP4 file is reproduced if the content reproduction apparatus supports adaptive streaming and, otherwise, an MP4 file that is not distributed is reproduced. For example, a method by which a content reproduction apparatus is caused to disclose each URL and attributes thereof to select the URL based on capability and attributes of the content reproduction apparatus is known.

The overview and specific examples of the above-described embodiment and the other embodiments are examples. The present invention can also be applied to various other embodiments. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.

What is claimed is:
 1. An information processing apparatus, comprising: processing circuitry configured to transmit information associated with electronic content to a device, wherein the electronic content, stored in a plurality of different bit rates, comprises a plurality of segments and the information associated with the electronic content comprises location information for each segment of the plurality of segments; receive a request from the device for at least one segment of the stored electronic content selected by the device based on the location information for each segment of the plurality of segments; and transmit the requested at least one segment to the device.
 2. The information processing apparatus according to claim 1, wherein the location information comprises a respective uniform resource locator (URL) corresponding to each segment of the plurality of segments.
 3. The information processing apparatus according to claim 2, wherein the information associated with the electronic content comprises bit rates of the plurality of segments of the electronic content.
 4. The information processing apparatus according to claim 1, wherein the electronic content comprises an index of each segment.
 5. The information processing apparatus according to claim 1, wherein the electronic content comprises an index to byte range of each segment of the plurality of segments.
 6. The information processing apparatus according to claim 1, wherein the at least one segment of the electronic content is selected by the device based on network conditions.
 7. The information processing apparatus according to claim 1, wherein the request from the device comprises the location information for each segment of the plurality of segments.
 8. The information processing apparatus according to claim 1, wherein the processing circuitry is further configured to receive a request from the device for the information associated with the electronic content.
 9. The information processing apparatus according to claim 1, wherein the processing circuitry is further configured to access the requested at least one segment based on the location information for the requested at least one segment.
 10. The information processing apparatus according to claim 4, wherein the processing circuitry is further configured to access the requested at least one segment based on the location information for the requested at least one segment and the index of each segment of the plurality of segments.
 11. The information processing apparatus according to claim 1, wherein the processing circuitry is configured to transmit the requested at least one segment to the device as an HTTP response.
 12. The information processing apparatus according to claim 1, wherein the electronic content comprises at least one of encoded audio content or encoded video content.
 13. The information processing apparatus according to claim 1, wherein the information associated with the electronic content is stored in a plurality of data files corresponding to the plurality of different bit rates.
 14. The information processing apparatus according to claim 1, wherein the data files correspond to an MP4 format.
 15. The information processing apparatus according to claim 1, wherein the request from the device comprises information identifying a predetermined bit rate of the plurality of different bit rates.
 16. The information processing apparatus according to claim 15, wherein the request from the device requests for segments of the plurality of segments having the predetermined bit rate.
 17. The information processing apparatus according to claim 1, wherein the at least one segment of the electronic content is selected by the device based on bit rate of the at least one segment.
 18. An information processing method, comprising: transmitting information associated with electronic content to a device, wherein the electronic content, stored in a plurality of different bit rates, comprises a plurality of segments and the information associated with the electronic content comprises location information for each segment of the plurality of segments; receiving a request from the device for at least one segment of the electronic content selected by the device based on the location information for each segment of the plurality of segments; and transmitting the requested at least one segment to the device.
 19. A non-transitory computer-readable medium having embodied thereon a program, which when executed by a computer causes the computer to execute a method, the method comprising: transmitting information associated with electronic content to a device, wherein the electronic content, stored in a plurality of different bit rates, comprises a plurality of segments and the information associated with the electronic content comprises location information for each segment of the plurality of segments; receiving a request from the device for at least one segment of the electronic content selected by the device based on the location information for each segment of the plurality of segments; and transmitting the requested at least one segment to the device.